Introduction

In  this  report,  we  will  summarize  the  work  supported  by  AFOSR  over  the  past  few  years. 
Producing  a  report  of  vast  length  does  not  seem  advisable,  so  the  emphasis  is  on  the  word 
“summarize”.  We  would,  of  course,  be  happy  to  provide  great  detail  about  any  of  the 
projects described here. The report will be organized around the three aims of the grant.
Within  each  aim,  we  will  organize  the  information  around  the  papers  and  manuscripts  that 
have  resulted  from  this  work. 


AIM  1:  Transcending  the  serial/parallel  dichotomy  in  visual  search:  Guided 
Search,  our  model  of  human  visual  search  behavior,  has  proposed  that  "preattentive" 
visual processes guide the deployment of attention from item to item in a serial, item-by-item
fashion. Others have argued for deployment of attention to multiple items in
parallel.  These  views  have  been  seen  as  opposed  to  one  another.  The  work  in  this  aim 
is  intended  to  reconcile  them  in  a  single  framework. 


The  primary  synthesis  of  our  current  views  on  “Guided  Search”  can  be  found  in: 


Wolfe, J. M. (2007). Guided Search 4.0: Current Progress With A Model Of Visual
Search. In W. Gray (Ed.), Integrated Models of Cognitive Systems (pp. 99-119). New York: Oxford.


Our  primary  interest  is  in  visual  search  tasks.  These  are  tasks  where  an  observer  looks  for 
some  target  in  a  display  containing  distractors.  “Classic”  Guided  Search  (Wolfe,  1994; 
Wolfe,  Cave  &  Franzel,  1989)  is  a  two-stage  model  of  visual  search.  With  support  from 
AFOSR,  we  developed  Guided  Search  4.0  (GS4).  The  basic  architecture  is  shown  here: 


Figure One: The large-scale structure of GS4. See text for details.

Using  the  numbers  on  the  figure  as  reference:  Parallel  processes  in  early  vision  (1)  provide 
input  to  object  recognition  processes  (2)  via  a  mandatory  selective  bottleneck  (3).  The 
parallel processes can provide information about a limited set of attributes (Wolfe &
Horowitz, 2004). Below, we discuss our most recent contributions to the understanding of
that set. The bottleneck allows one “preattentive object file” (Wolfe & Bennett, 1997) at a
time to pass to the object recognition processes at a rate of about one every 50 msec.
Another  way  to  describe  this  is  that  the  front-end  of  the  system  creates  a  “priority  map” 
(Serences  &  Yantis,  2006)  that  represents  the  preattentive  guess  about  the  likely  location(s) 
of  targets.  The  bottleneck  represents  a  winner-take-all  (WTA)  process  that  passes  the  most 
likely  item  to  the  next  stage.  (For  recent  evidence  for  the  WTA  nature  of  selection,  see 
Zenon, Hamed, Duhamel & Olivier, 2009.)

We  model  object  recognition  as  a  diffusion  process  (Ratcliff,  1978;  Ratcliff,  Gomez  & 
McKoon,  2004).  The  details  of  object  recognition  are  outside  the  scope  of  our  work  (and  a 
very  hard  problem,  altogether).  Diffusion  of  the  information  required  to  identify  a  target 
takes  place  over  several  hundred  msec.  This  means  that,  although  items  are  selected  into 
the  object-recognition  diffuser  one  at  a  time,  several  objects  will  be  in  the  process  of  being 
identified  at  the  same  time.  As  a  consequence,  GS4  is  a  hybrid  serial/parallel  model 
(Moore & Wolfe, 2001; Wolfe, 2003).
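The hybrid serial/parallel idea can be made concrete with a small sketch. It assumes, purely for illustration, that items are selected at a fixed interval of 50 msec (the rate mentioned above) and that each item then needs a fixed 300 msec to be identified. The 300 msec value and the function name are hypothetical stand-ins for "several hundred msec" and are not parameters of GS4 itself.

```python
def items_in_recognition(t_ms, select_interval_ms=50, diffusion_ms=300, n_items=20):
    """Count the items whose identification is under way at time t_ms.

    Illustrative assumptions (not taken from the GS4 simulations):
    item k is selected at k * select_interval_ms and is identified
    exactly diffusion_ms later.
    """
    count = 0
    for k in range(n_items):
        start = k * select_interval_ms   # serial selection, one item at a time
        end = start + diffusion_ms       # identification finishes here
        if start <= t_ms < end:
            count += 1
    return count


if __name__ == "__main__":
    for t in (0, 100, 500):
        print(t, items_in_recognition(t))
```

Under these assumed values, selection is strictly serial, yet once the pipeline fills about 300 / 50 = 6 items are being identified at the same time.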

As  noted,  the  guidance  in  Guided  Search  is  the  use  of  information  from  early  visual 
processes  to  guide  access  to  the  selective  bottleneck  (3).  In  GS4,  guidance  is  imagined  as  a 
control  device,  sitting  to  one  side  of  the  pathway  from  input  to  object  recognition  (4).  The 
reason  for  this  is  that  the  properties  of  guidance  turn  out  to  differ  from  the  processes  that 
give  rise  to  perception  and  this  is  hard  to  explain  if  guidance  is  in  the  main  pathway.  To 
give  a  very  recent  example,  we  have  had  Os  search  for  desaturated  (e.g.  pink,  light  blue) 
targets among saturated (e.g. red, blue) and white distractors. We carefully equated the
perceptual  distances  between  targets  and  distractors.  Thus,  the  perceptual  distance  from 
pink  to  white  was  the  same  as  the  perceptual  distance  from  light  blue  to  white,  for 
example.  Interestingly,  search  for  pink  among  red  and  white  was  hundreds  of  msec  faster 
than  search  for  light  blue  among  blue  and  white  (Kuzmova  et  al.,  2008).  For  present 
purposes,  the  point  is  that  pink’s  ability  to  guide  differs  from  its  perceptual  salience. 

GS1-GS3  were  single  pathway  models.  In  GS4,  we  model  visual  processing  as  having  two 
pathways:  a  selective  pathway  and  a  non-selective  pathway  (5).  The  non-selective  pathway 
is  not  subject  to  the  bottleneck  in  the  selective  pathway  (3).  It  is  capable  of  processes  like 
analysis of texture statistics (Ariely, 2001; Chong & Treisman, 2003) and even some
crude semantic analysis of scenes (“beach” or “city street”, not “corner of 4th and Main”)
(Oliva & Torralba, 2001; Oliva, 2005; Potter, Staub, & O'Connor, 2004). It is not capable
of object recognition (Evans & Treisman, 2005; Walker, Stafford, & Davis, 2008). There
is  support  for  these  ideas  in  recent  neural  data  (Peelen,  Fei-Fei,  &  Kastner,  2009). 

It  is  probable  that  some  relatively  late  information,  perhaps  semantic  information,  can 
influence  guidance  (6).  Examples  include  (Henderson,  Brockmole,  Castelhano,  &  Mack, 
2007; J. M. Henderson & Ferreira, 2004; Hidalgo-Sotelo, Oliva, & Torralba, 2005; Vo &
Henderson,  2009).  This  is  implied  in  models  like  Ahissar  and  Hochstein’s  “Reverse 
Hierarchy  Model”  (Ahissar  &  Hochstein,  2004;  Hochstein  &  Ahissar,  2002).  We  continue 
to  investigate  this  topic. 

GS4 envisions at least two bottlenecks in processing since the selective bottleneck in visual
search  (3)  seems  to  be  separable  from  the  bottleneck  (7)  that  produces  effects  like  the 
attentional  blink  ("AB"  in  the  figure,  see  Shapiro,  1994).  For  example,  while  some  scene 
processing  may  be  possible  via  a  non-selective  pathway,  perception  of  that  scene  seems  to 
be fully blocked by the attentional blink (Marois, Yi, & Chun, 2004).

The  GS4  chapter  describes  the  results  of  simulations  that  mimic  the  main  results  of  visual 
search  experiments.  Notably,  it  produces  RT  distributions  that  are  qualitatively  similar  to 
the  data.  Part  of  our  work  has  been  to  better  characterize  RT  distributions  in  search.  This 
work  is  summarized  in  two  papers: 


Distributions in Visual Search Tasks ... What Do RT Distributions


There  is  theoretically  useful  information  in  the  distribution  of  reaction  times  in  visual 
search  tasks.  However,  two  difficulties  stand  in  the  way  of  exploiting  RT  distributions. 
First,  there  are  too  few  trials  in  a  typical  dataset  and  second,  unlike  means,  RT 
distributions  are  not  trivial  to  combine  across  observers.  We  addressed  both  of  those 
problems in a pair of papers. First, we collected a very large data set, running ten observers
on 1000 trials at each of 4 set sizes in each of three search tasks: a feature search, with the
target defined by color; a conjunction search, with the target defined by a combination of
color and orientation; and a spatial configuration search, where the target was a 2 among
distractor 5s.


This  large  data  set  allows  us  to  characterize  the  RT  distributions  in  detail.  Figure  2  (from 
the  paper)  shows  the  individual  target  present  and  absent  distributions  for  ten  observers 
performing  the  conjunction  task: 

Figure Two: Set size versus response time.

Note  that  the  distributions  all  have  similar  positively  skewed  forms.  White  lines  are  target 
present  distributions.  Black  lines  are  target  absent.  Moreover,  the  present  and  absent 
distributions  tend  to  overlap  extensively  (in  ways  that  are  puzzling  from  the  point  of  view 
of  the  effort  to  model  target-absent  trials). 

We fit several psychologically motivated functions (ex-Gaussian, ex-Wald, Gamma, and
Weibull) to the data. All fit reasonably well, so if a model had a theoretically motivated
reason to believe that the distribution should have a specific form, these data would be
supportive. That said, it is not obvious that these distributions should be fit by any very
simple function. After all, the RT will be the product of, at least, initial visual processing,
search, and the motor output. Each of these processes will have its own temporal character
and there is no reason to assume that the concatenation of the processes will result in some
simple  distribution.  Of  more  importance  are  the  more  qualitative  statements  that  can  be 
made  on  the  basis  of  the  RT  distribution  data. 

In  order  to  make  such  statements,  we  developed  a  non-parametric  normalization  procedure, 
the  “x-score  transform”,  that  allows  us  to  compare  distributions  via  quantile  alignment. 
The x-score transform aligns the 25th and 75th percentiles of a distribution to any two
arbitrary values (e.g., -1 and +1, respectively). This procedure removes linear scaling
differences in distributions while preserving non-linear properties such as skew and
kurtosis. The x-score transform tends to isolate the shapes of distributions regardless of
their original mean and variance. In the second RT distribution paper, we applied the x-score
transform to the distributions from feature, conjunction, and spatial configuration
search data from the first paper. We used an iterative Kolmogorov-Smirnov cluster
analysis  to  determine  which  distributions  should  be  combined  or  kept  separate.  The  most 
striking  finding  is  that,  while  there  are  some  reliable  differences  between  specific 
conditions,  the  great  majority  of  these  normalized  distributions  cluster  together  and  share  a 
common  underlying  shape  across  variations  in  set  size  and  target  presence  or  absence.  This 
finding  has  implications  for  theories  of  search.  For  example,  if  your  model  predicts  that  RT 
distributions  should  change  in  shape  as  a  function  of  set  size,  your  model  is  wrong. 
Regrettably,  it  is  hard  to  find  a  model  that  does  not  predict  such  a  change.  This  remains  an 
interesting  problem  for  modeling.  The  easy  way  out  would  be  to  imagine  that  the  shape  of 
the  RT  function  is  driven  largely  by  non-search  components  but  this  does  not  seem  terribly 
plausible. 
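A minimal sketch of the quantile alignment described above may help readers who want to apply it to the posted data. It assumes a linearly interpolated percentile and the example target values of -1 and +1. The function names and the interpolation rule are illustrative choices, not necessarily those used in the papers.

```python
def percentile(sorted_vals, p):
    """Linearly interpolated percentile (p in [0, 100]) of a pre-sorted list."""
    if not sorted_vals:
        raise ValueError("need at least one value")
    pos = (len(sorted_vals) - 1) * p / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def x_score(rts, low=-1.0, high=1.0):
    """Rescale RTs so the 25th percentile maps to `low` and the 75th to `high`.

    This is a linear transform, so it removes location and scale
    differences but leaves the shape (skew, kurtosis) untouched.
    """
    ordered = sorted(rts)
    q25 = percentile(ordered, 25)
    q75 = percentile(ordered, 75)
    if q75 == q25:
        raise ValueError("interquartile range is zero; cannot normalize")
    scale = (high - low) / (q75 - q25)
    return [low + (rt - q25) * scale for rt in rts]


if __name__ == "__main__":
    rts = [400, 500, 600, 700, 800]           # made-up RTs in msec
    slower = [2 * rt + 100 for rt in rts]      # same shape, shifted and stretched
    print(x_score(rts))
    print(x_score(slower))
```

The two made-up distributions differ only by a linear rescaling, so they yield the same x-scores. Any remaining difference between two transformed distributions would reflect a difference in shape.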

We  have  posted  the  data  from  these  papers  on  our  website  so  that  anyone  else  who  wants 
to  model  RT  distributions  will  have  the  distributions  to  model. 
http://search.bwh.harvard.edu/new/data_set.html


Palmer et al.

This  paper  represents  another  effort  to  make  a  qualitative  distinction  between  two  broad 
classes  of  visual  search  models.  Attention-limited  models  propose  two  stages  of  perceptual 
processing: an unlimited capacity preattentive stage and a limited-capacity selective
attention  stage.  Conversely,  noise-limited  models  propose  a  single  unlimited  capacity 
perceptual  processing  stage,  with  decision  processes  limited  by  perceptual  signal  quality. 

In  this  study,  we  arranged  for  a  feature  search  to  be  harder  than  a  spatial  configuration 
search  for  a  set  size  of  one  (not  really  a  search  at  that  point).  Stimuli  were  presented  briefly 
so  the  measures  of  interest  were  accuracy  and  related  signal  detection  measures,  notably, 
d’. Now consider what should happen if set size is increased. A single-stage, parallel
processing model will predict that, if task A is harder than B at set size 1, it must remain
harder  at  all  set  sizes.  In  contrast,  a  two-stage  model  (e.g.  Guided  Search)  makes  a 
different  prediction.  Performance  on  the  easier  spatial  configuration  search  degrades  more 
rapidly  as  set  size  goes  up  because  the  initial  guidance  stage  does  nothing  useful  for  that 
search.  By  contrast,  front-end  guidance  can  help  a  second  decision  stage  in  the  feature 
search  task.  The  prediction  would  be  a  crossover  interaction  with  spatial  configuration 
easier  at  small  set  sizes  and  feature  search  easier  at  larger  set  sizes. 

Figure Three: Results from the Palmer et al. “cross over” experiment. Panels: Physical Set Size Manipulation; Attentional Set Size Manipulation.

Figure  Three  shows  the  results  from  two  versions  of  this  experiment.  The  data  show  the 
clear  cross-over  interaction  predicted  by  Guided  Search.  The  useful  feature  of  this 
experiment  is  its  qualitative  nature.  A  cross-over  interaction  rules  out  a  whole  class  of 
models without the need to engage in detailed curve fitting or parameter estimation. If your
model  proposes  that  search  for  an  item  of  one  orientation  and  search  for  a  2  among  5s  are 
both  done  by  a  parallel  process  with  a  single  decision  rule,  then  these  data  will  be  a 
problem  for  your  model. 


Consider  a  search  for  a  red  letter  “T”  among  red  and  black  letters.  How  is  attention  guided 
to  red  items?  There  are  several  views  of  how  this  guidance  might  be  implemented  in  the 
visual  system.  First,  it  could  be  that  top-down  guidance  acts  like  a  filter,  placed  across  the 
input  stream  for  an  entire  block  of  trials,  effectively  passing  only  items  of  the  correct  color 
as  candidates  for  attention  as  in  Figure  4,  below. 

Figure Four: Guidance as a filter


Alternatively,  all  items  might  be  initially  passed  up  the  visual  pathway,  with  top-down 
guidance  intervening  only  after  a  delay  or  in  some  feedback  process  to  weed  out  items  of 
the  wrong  color. 

Figures Five and Six illustrate an approach to this problem that we took in a set of
experiments.  The  observer  was  faced  with  a  set  of  Cs  in  four  possible  orientations.  All  but 
one  opened  up  or  down.  The  target,  present  on  each  trial,  opened  to  the  left  or  right,  and 
observers  were  asked  to  report  its  orientation.  Each  C  was  placed  on  a  colored  disk. 
Suppose  that  observers  knew  that  the  target  was  always  on  a  gray  disk  (We  used  color  in 
the  actual  experiment).  If  the  top-down  guidance  to  gray  acted  like  a  persistent  filter,  then 
that  filter  should  reduce  the  effective  set  size  to  the  set  of  four  gray  items.  If  no  guidance 
were  available,  this  would  be  a  search  through  the  16  Cs  on  the  left  side  of  the  figure. 

To  examine  the  temporal  dynamics  of  guidance,  we  varied  the  time  of  onset  of  the  colors 
relative  to  the  onset  of  the  Cs  (Figure  Five). 

Figure Five: Hypothetical outcomes for the time-to-guide experiments. Solid line
assumes  that  color  guidance  is  available  as  soon  as  the  color  is  available.  Dashed  line 
assumes  that  guidance  begins  about  300  msec  after  the  color  becomes  available. 

Consider  a  simple,  two-state  model.  At  any  moment,  observers  are  either  searching 
through  all  16  items  in  an  unguided  manner,  or  they  are  restricting  their  search  to  the  four 
items  of  the  target  color.  For  simplicity,  assume  that  there  is  a  sharp  transition  between 
those  two  states.  Suppose  that  the  Cs  appear  400  msec  before  the  color  cue  (SOA  =  -400  in 
Figure  Five).  When  the  Cs  appear,  observers  must  begin  by  searching  through  16  items 
because  there  is  no  guiding  information.  After  400  msec,  the  color  information  appears. 
Once  it  becomes  effective,  this  becomes  a  search  through  four  items.  The  RT,  therefore,  is 
a mixture distribution of some purely unguided searches, when the observer finds the
target  before  the  color  ever  appears,  and  some  that  benefit  from  eventual  guidance.  As  the 
SOA  becomes  increasingly  negative,  there  is  a  greater  chance  that  the  search  will  finish 
before  the  color  becomes  available.  At  the  longest  negative  SOAs,  RTs  should 
approximate  the  16-item  baseline,  the  time  required  to  find  a  target  when  there  is  no  color 
guidance.  The  four-item  baseline  is  the  RT  for  an  unguided  search  through  a  set  of  just 
four  items. 

The solid curved line in the bottom half of Figure Five illustrates the prediction if guidance
starts as soon as the guiding information is presented. For any positive SOA,
the  task  looks  like  search  through  just  4  items.  The  dashed  line  shows  what  the  results 
would  look  like  for  a  hypothetical  300  msec  delay  in  guidance.  The  search  doesn’t  look 
like  a  4-item  search  until  the  colors  precede  the  Cs  by  300  msec. 
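The two-state logic can be sketched numerically. The version below adds assumptions that are not part of the experiments: attention samples one item every 50 msec, sampling is with replacement, and non-search components of the RT are ignored. All names and parameter values are hypothetical and serve only to show how the predicted curve moves from the 16-item baseline to the 4-item baseline.

```python
def expected_search_time(soa_ms, guidance_delay_ms=0.0, item_ms=50.0,
                         n_all=16, n_subset=4):
    """Expected search time (msec after the Cs appear) in a two-state model.

    soa_ms > 0 means the colors precede the Cs; soa_ms < 0 means the Cs
    appear first. Guidance switches on guidance_delay_ms after color onset.
    Assumed for illustration: one item sampled per item_ms, with replacement.
    """
    # Time after the Cs appear at which the search becomes guided.
    guide_onset = max(0.0, guidance_delay_ms - soa_ms)
    unguided_samples = int(guide_onset // item_ms)

    p_all = 1.0 / n_all        # chance of hitting the target when unguided
    p_subset = 1.0 / n_subset  # chance of hitting it once guided

    expected = 0.0
    still_searching = 1.0
    for i in range(1, unguided_samples + 1):
        # Target found on the i-th unguided sample.
        expected += still_searching * p_all * (i * item_ms)
        still_searching *= 1.0 - p_all
    # Otherwise the remaining search is a guided search through the subset.
    expected += still_searching * (unguided_samples * item_ms + item_ms / p_subset)
    return expected


if __name__ == "__main__":
    for soa in (-400, -100, 0, 300):
        immediate = expected_search_time(soa, guidance_delay_ms=0)
        delayed = expected_search_time(soa, guidance_delay_ms=300)
        print(soa, round(immediate, 1), round(delayed, 1))
```

With these assumed values, immediate guidance reaches the 4-item level at SOA 0, whereas a 300 msec delay keeps the SOA 0 prediction well above that level until the colors lead by 300 msec. This is the qualitative difference between the solid and dashed lines in Figure Five.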

Figure Six: Results for one set of time-to-guide experiments (x-axis: Cue - Search SOA).
Black line shows data from Blocked trials. Gray line shows data for Mixed trials with
consistent mapping of target and distractor colors. The dashed line shows data for Mixed
trials with inconsistent mapping.


Figure Six shows the average of the median RTs for 15 observers. Each observer was tested
for 30 trials at each of 13 SOAs. Each observer also completed 27 trials of unguided search
with  set  sizes  of  4  and  16  items  to  establish  the  baselines,  plotted  as  horizontal  lines  at  their 
median  value. 


The  most  important  data  are  those  plotted  in  black.  They  show  the  results  for  a  blocked 
condition  where  observers  knew  that  the  4  item  subset  containing  the  target  would  always 
be  the  same  color  (e.g.  “Look  for  red”).  The  12  distractors  also  preserved  the  same 
distractor  color  for  the  entire  block.  Performance  at  SOA  0  is  significantly  above  the 
baseline. This argues that, even under conditions when an observer can maintain the same
‘guiding principles’ for an entire block of hundreds of trials, guidance takes time to
develop on any given trial. Apparently, when the stimuli first appear, everything is passed
through as a candidate target. Only after 200-300 msec does guidance become fully
effective. 


The  gray  line  shows  a  consistent  mapping  condition  in  which  Os  knew  that  some  colors 
(e.g.  red,  purple,  blue)  were  target  colors  and  others  (e.g.  yellow,  green,  cyan)  were 
distractor  colors  but  where  the  specific  colors  could  vary  from  trial  to  trial.  This  produces 
similar  results  to  the  blocked  condition.  In  contrast,  the  dashed  line  shows  data  from 
conditions  where  Os  knew  only  that  the  target  would  be  in  the  smaller  color  subset  but  the 
colors  changed  at  random  from  trial  to  trial.  Under  these  conditions,  full  guidance  took 
much longer to establish. This can be considered to be a version of a switch cost (Wylie &
Allport, 2000; Mayr & Kliegl, 2003).


Guiding  Features 

As part of the effort to understand guidance, we carried out several series of experiments on
the  attributes  that  might  guide  attention.  We  describe  these  briefly  here: 


Horowitz, 


We  know  that  motion  is  a  basic  guiding  attribute  (e.g.  find  the  moving  item  among 
stationary)  (Dick,  Ullman,  &  Sagi,  1987;  McLeod,  Driver,  &  Crisp,  1988).  Here  we  asked 
whether it is possible to search for items based on their type of motion. We examined three types
of  motion:  1)  ballistic  motion,  in  which  objects  move  in  a  straight  line  until  they  encounter 
an  obstacle;  2)  random  walk  motion,  in  which  objects  change  direction  randomly;  3) 
composite  motion,  in  which  objects  move  with  random  fluctuations  around  a  generally 
ballistic  trajectory.  The  data,  a  complicated  pattern  of  search  asymmetries,  can  be  modeled 
if  we  assume  that  Os  can  guide  attention  using  processes  sensitive  to  the  presence  of  linear 
motion  and  change  in  motion.  The  results  do  not  support  the  idea  that  we  have  a  more 
sophisticated  ability  to  segregate  items  based  on  the  nature  of  their  motion. 


In  these  experiments,  we  considered  whether  attention  could  be  guided  by  Kanizsa-type 
subjective  contours  and  by  subjective  contours  induced  by  line  ends.  This  is  a  topic  with 
some history (Davis & Driver, 1994; Davis & Driver, 1998; Gurnsey, Humphrey, &
Kapitan, 1992; Gurnsey, Poirier, & Gascon, 1996). In our work, unlike in previous
experiments,  we  compared  search  performance  with  subjective  contours  against 
performance with real, luminance contours. Moreover, observers searched for orientations
or shapes created by the subjective contours, rather than searching for the
presence  of  the  contours  themselves.  We  replicated  the  usual  finding  that  visual  search  for 
one  orientation  or  shape  among  distractors  of  another  orientation  or  shape  was  efficient 
when  the  items  were  defined  by  luminance  contours.  Search  was  much  less  efficient 
among  items  defined  by  Kanizsa-type  subjective  contours.  However,  search  remained 
efficient  when  the  items  were  defined  by  subjective  contours  induced  by  line  ends.  This 
may  reflect  a  difference  in  the  underlying  neural  computations  that  support  these  types  of 
subjective  contours. 


Wolfe. 


These experiments involved stimuli like those shown here in Figure Seven. Os search for a
T among Ls. We could provide “classic” guidance by telling Os that the T, if present, would
be yellow. Alternatively, we could guide to the surface. Here, the cue would be that the T
was on a left-facing surface. We can certainly use some sort of surface guidance in the
world. If you were asked to look for a painting, you would look on walls, not floors or
ceilings, for example. We wanted to know if the two forms of guidance are equivalent. As
the title of the paper states, they are not. When a target can lie on one of many surfaces,
color guidance is effective but surface guidance is not (Exp. 1-3 of the paper). We found
that there was effective guidance to multiple cubes if all those cubes were coplanar. In that
case, Os could guide to the coplanar
tops  of  the  cubes  (Exp.  4).  Similarly,  Os  could  guide  to  a  limited  number  of  surfaces  (Exp. 
5).  We  believe  that,  while  surface  guidance  must  exist,  it  is  slow  compared  to  color 
guidance  and  seems  to  be  limited  to  fewer  surfaces  at  one  time. 


Reijnen, E., Pedersini, R., Pinto, Y., Horowitz, T. S., & Wolfe, J. M. (2009). Pre-Attentive
Processing Of Occlusion Information for Visual Search. Attention, Perception &
Psychophysics, ms (submitted Feb 09).


Figure Eight: Do occluded bars behave like ‘real’ bars in visual search?


In  this  project,  we  were  concerned  with  the  ability  of  preattentive  processes  to  complete 
contours  behind  occluders.  Searching  for  vertical  among  horizontals  using  stimuli  like 
those  in  Fig.  8a,b  is  very  easy.  What  about  the  stimuli  in  8c,d?  Is  the  orientation  of  those 
occluded bars available to guide search or not? Fig. 8e,f show stimuli for control conditions.
In a rather long series of experiments, we have found that it is possible to create conditions
where the occluded bars support efficient search and the control conditions do not.
However, the effect is rather fragile, suggesting that the orientation signal created by those
occluded  bars  is  rather  weak. 

Contextual Cueing

About  10  years  ago,  Chun  and  Jiang  (1998)  described  a  new  phenomenon  that  they  called 
“contextual  cueing”.  They  showed  that  RTs  became  faster  if  the  same  search  displays  were 
repeated. Observers learned something about the displays even though these were random
collections of meaningless stimuli (e.g. Ts among Ls). Chun and Jiang (and many others
since) argued that this was a form of guidance. They believed that Os were implicitly
learning  that  they  should  guide  attention  to  this  location  in  the  presence  of  this 
arrangement  of  items.  Such  a  phenomenon  with  such  an  explanation  is  of  obvious 
relevance  to  our  Guided  Search  project. 


Kunar,  M.  A.,  Flusbe 

Most of the work on contextual cueing (CC) had used mean RT as the measure of interest.
However,  if  one  is  interested  in  guidance,  the  critical  measure  is  the  slope  of  the  RT  x  set 
size  function.  Indeed,  if  CC  provided  perfect  guidance,  slopes  should  drop  to  zero.  The 
display  configuration  would  point  the  observer  directly  to  the  target,  regardless  of  the 
number  of  distractors  present.  We  did  an  extensive  series  of  CC  experiments  with  set  size 
manipulations  and  simply  could  not  find  an  effect  of  CC  on  search  efficiency.  We  could 
replicate  the  basic  CC  effect  on  RT  but  the  slope  did  not  change.  We,  reluctantly,  conclude 
that  CC  is  not  a  form  of  guidance.  Our  guess  is  that  it  is  a  form  of  response  priming.  You 
are  a  bit  faster  to  say  that  the  target  is  the  target  if  it  is  in  a  familiar  setting. 
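The distinction between a change in mean RT and a change in slope can be illustrated with a short least-squares sketch. The RT values below are invented for illustration and are not data from these experiments. They are constructed so that repeated displays are faster overall while the slope of the RT x set size function stays the same.

```python
def rt_slope_intercept(set_sizes, mean_rts):
    """Ordinary least-squares slope (msec/item) and intercept (msec)."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    set_sizes = [8, 12, 16]
    novel_rts = [700, 820, 940]      # hypothetical mean RTs, new displays
    repeated_rts = [640, 760, 880]   # hypothetical mean RTs, repeated displays
    print("novel:   ", rt_slope_intercept(set_sizes, novel_rts))
    print("repeated:", rt_slope_intercept(set_sizes, repeated_rts))
```

In this made-up pattern the benefit of repetition appears entirely in the intercept. That is the signature one would expect from a non-search component such as response priming rather than from guidance.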


At  some  level,  we  must  be  wrong  about  this.  You  can  certainly  use  your  overt  knowledge 
of  a  scene  to  guide  your  attention.  You  look  for  your  coffee  maker  in  a  specific  location 
because  you  have  learned  and  remembered  that  context.  The  issue  might  be  one  of  time 
scale.  In  another  series  of  experiments,  we  found  that  we  could  get  a  form  of  CC  guidance 
if  we  slowed  search.  If  there  was  enough  time,  knowledge  about  the  layout  of  a  display 
could  be  used  to  direct  attention  to  target  locations. 


We  extended  the  CC  phenomenon  to  what  we  call  global  features.  For  the  bulk  of  the 
studies reported here, the global feature was color. This meant that, if the display was red, the
target  could  be  found  in  one  location.  If  it  was  blue,  the  target  was  consistently  presented 
in  another  location,  and  so  forth.  The  results  are  similar  in  their  essentials  to  the  classic 
CC.  Os  could  learn  the  association  between  color  and  target  location.  This  learning 
speeded RTs but did not change the slope of the RT x set size function, suggesting no
guidance. However, if a relatively long delay intervened between the appearance of the
global  cue  and  the  start  of  the  search,  then  search  efficiency  could  be  improved. 

Contextual Cueing, in revision.


One  final  piece  of  information  about  contextual  cueing:  Nothing  seems  to  be  learned  on 
target  absent  trials.  On  the  face  of  it,  there  is  no  good  reason  why  Os  should  not  learn  that 
this display configuration means that there is no target. However, that does not seem to be the
case. 

Microsaccades 


Horowitz, T. S., Fine, E. M., Fencsik, D. E., Yurgenson, S., & Wolfe, J. M. (2007).
Fixational Eye Movements Are Not An Index Of Covert Attention. Psychol Sci, 18(4),
356-363.

Horowitz, T. S., Fencsik, D. E., Fine, E. M., Yurgenson, S., & Wolfe, J. M. (2007).
Microsaccades And Attention: Does A Weak Correlation Make An Index? Reply To
Laubrock, Engbert, Rolfs, & Kliegl (2007). Psychol Sci, 18(4), 367-368.


One  of  our  long-time  fantasies  has  been  that  someone  would  develop  a  “covert  attention 
tracker”,  akin  to  an  eye  tracker.  If  we  assume  that  it  is  being  deployed  from  item  to  item, 
covert  attention  is  being  deployed  at  a  rate  of  something  like  20-40  Hz.  Under  most 
circumstances,  the  deployment  of  the  eyes  is  closely  tied  to  the  deployment  of  attention  but 
at  a  slower  rate  of  3-4  Hz.  Thus,  we  were  excited  by  reports  that  microsaccadic  eye 
movements  might  serve  as  pointers  to  the  loci  of  covert  attention  (Engbert  &  Kliegl,  2003). 
Unfortunately, in our hands, as the title says, these “eye movements are not an index of
covert  attention.”  This  is  not  a  settled  issue  (hence  the  second  article).  However,  we  still 
await  a  method  that  would  allow  tracking  of  covert  attention  within  a  trial. 


AIM  2:  Understanding  the  role  of  memory  in  visual  search:  Standard  serial  models 
of  attention  have  assumed  that  items  in  the  display  are  sampled  without  replacement.  In 
the  previous  grant  period,  we  have  shown  that  the  data  reject  this  assertion  of  perfect 
memory for rejected distractors. We have proposed that items are sampled with
replacement  in  typical  search  tasks.  Data  from  other  labs  suggest  the  possibility  that 
some  partial  memory  (perhaps  oculomotor)  discourages  deployment  to  recently 
attended  items.  In  the  next  grant  period,  we  will  investigate  the  theoretical  and  practical 
consequences  of  visual  search  with  limited  memory  for  previous  deployments  of 
attention.

Several aspects of the role of memory in visual search have interested us during the grant
period.  The  original  roots  of  our  interest  lie  in  our  1998  finding  that,  as  we  put  it,  “Visual 
search  has  no  memory.”  (Horowitz  &  Wolfe,  1998).  This  claim  was  based  on  a  series  of 
experiments  in  which  items  in  a  visual  search  task  were  randomly  repositioned  every  100 
msec  (or  every  500  msec  in  some  experiments).  This  makes  it  impossible  to  mark  rejected 
distractors.  If  it  were  the  case  that  rejected  distractors  were  marked  to  eliminate  them  as 
candidate  targets  in  normal  search,  then  random  replotting  should  make  search  efficiency 
significantly  worse.  In  fact,  search  efficiency  is  about  the  same  in  the  standard  static  and 
the  dynamic,  replotting  conditions. 
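
The logic of this prediction can be made concrete with a small simulation. The sketch below is illustrative only and is not the analysis used in the experiments: it assumes a strictly serial search for one target among N items, and the function names are hypothetical. Under these assumptions, sampling without replacement (perfect memory) needs about (N+1)/2 deployments on average, while sampling with replacement (no memory) needs about N, so a search that relied on memory would be expected to slow down when replotting makes that memory useless.

```python
import random

def deployments_with_memory(n_items, rng):
    """Serial search without replacement: rejected items are never revisited."""
    order = list(range(n_items))
    rng.shuffle(order)
    # Item 0 is the target; count deployments up to and including it.
    return order.index(0) + 1

def deployments_without_memory(n_items, rng):
    """Serial search with replacement: each deployment picks any item at random."""
    count = 1
    while rng.randrange(n_items) != 0:
        count += 1
    return count

def mean_deployments(search_fn, n_items, n_trials=20000, seed=1):
    """Average number of deployments over many simulated trials."""
    rng = random.Random(seed)
    return sum(search_fn(n_items, rng) for _ in range(n_trials)) / n_trials

if __name__ == "__main__":
    for n in (4, 8, 16):
        with_mem = mean_deployments(deployments_with_memory, n)
        no_mem = mean_deployments(deployments_without_memory, n)
        print(f"set size {n:2d}: memory {with_mem:5.2f}, no memory {no_mem:5.2f}")
```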


T,  S.,  &  M.  (2005).  Vistim 


Our  claim  of  no  memory  for  rejected  distractors  has  proven  controversial  (Beck,  Peterson, 
&  Vomela,  2006;  Dukewich  &  Klein,  2005;  Kristjansson,  2000;  McCarley,  Wang,  Kramer, 
Irwin,  &  Peterson,  2003;  Peterson,  Kramer,  Wang,  Irwin,  &  McCarley,  2001;  Shore  & 
Klein,  2000;  von  Muhlenen,  Muller,  &  Muller,  2003).  Though  we  continued  to  find  no 
reliable  evidence  for  marking  of  rejected  distractors  (Horowitz  &  Wolfe,  2001),  others, 
using  different  methods,  found  some.  There  are  various  possibilities.  It  may  be  that  there  is 
memory  attached  to  the  deployment  of  the  eyes,  if  not  to  the  deployment  of  covert 
attention.  It  may  be  that  there  is  some  memory  for  a  few  deployments  of  covert  attention. 
Our  position,  as  argued  in  the  papers  noted  here,  is  that  a  reasonable  consensus  might  be 
that  “visual  search  has  very  little  memory”.  It  may  have  just  enough  to  prevent 
perseveration  in  search.  If  there  is  nothing  biasing  observers  away  from  recently  visited 
items,  it  is  hard  to  see  why  attention  doesn’t  get  stuck  on  the  most  salient  item  in  the  field. 


Memory  &  Search  In  Autistic  Observers 


Visual Search Superior In Autism Spectrum Disorder? Developmental Science, in press.


Special  populations  of  observers  can  provide  insight  into  the  mechanisms  of  search.  It  is 
interesting,  therefore,  to  find  that  individuals  diagnosed  with  Autism  Spectrum  Disorder 
(ASD)  outperform  controls  on  visual  search  tasks  (O'Riordan,  Plaisted,  Driver,  &  Baron- 
Cohen,  2001;  Plaisted,  O'Riordan,  &  Baron-Cohen,  1998).  We  wondered  if  this  was  due  to 
superior  memory  for  rejected  distractors.  Maybe  visual  search  does  have  memory  in  the 
autistic  population. 

To  assess  this  possibility,  we  compared  the  performance  of  21  children  with  ASD  and  21 
age-  and  IQ-matched  typically  developing  (TD)  children  in  a  standard  static  search  task 
and a dynamic search task, like those described above, with targets and distractors randomly
changing positions every 500 ms. The ASD observers had faster RTs than the TD children.
However,  as  in  our  previous  work,  they  showed  no  disruption  in  search  efficiency  in  the 
dynamic  condition.  If  they  had  memory  for  rejected  distractors  in  the  static  search 
conditions,  then  they  should  have  had  elevated  RT  x  set  size  slopes  in  the  dynamic  case. 
Thus,  it  seems  unlikely  that  memory  for  rejected  distractors  is  the  source  of  the  enhanced 
visual  search  abilities  in  the  ASD  group. 

While there were no differences in slopes, there were lower intercepts for the ASD group in
both  static  and  dynamic  search,  suggesting  that  the  ASD  group  has  an  advantage  in  some 
non-search  processes.  We  suspect  that  they  are  faster  to  determine  if  the  current  object  of 
attention  is  a  target  or  a  distractor.  We  gain  some  support  for  this  from  eye-movement  data. 
ASD and TD groups produced similar numbers and spatial distributions of fixations.
However, fixation duration was shorter in the ASD group, as if they needed to spend less time on
each  fixation. 
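
The distinction between slope and intercept can be illustrated with a least-squares fit of RT against set size. The sketch below uses made-up numbers, not data from the study, and the helper name is hypothetical. Two groups with the same slope but different intercepts differ in processes that do not scale with the number of items in the display.

```python
def fit_rt_by_set_size(set_sizes, mean_rts):
    """Ordinary least-squares line: RT = intercept + slope * set_size."""
    n = len(set_sizes)
    mean_x = sum(set_sizes) / n
    mean_y = sum(mean_rts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(set_sizes, mean_rts))
    sxx = sum((x - mean_x) ** 2 for x in set_sizes)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical mean RTs (msec) at three set sizes; illustrative only.
set_sizes = [5, 10, 15]
group_a = [900, 1100, 1300]   # slope 40 msec/item, intercept 700 msec
group_b = [750, 950, 1150]    # same slope, intercept 150 msec lower

if __name__ == "__main__":
    for name, rts in (("A", group_a), ("B", group_b)):
        slope, intercept = fit_rt_by_set_size(set_sizes, rts)
        print(f"group {name}: slope {slope:.1f} msec/item, intercept {intercept:.1f} msec")
```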

Why  Don’t  We  Use  Memory? 


As  noted  above,  all  of  these  failures  to  show  an  influence  of  memory  are  a  bit  mysterious 
since  it  is  self-evidently  true  that  we  can  use  memory  in  search.  We  explored  this  puzzle  in 
a  follow-up  on  studies  of  repeated  search  (largely  the  topic  of  a  different  grant  and  not 
extensively  discussed  here).  The  core  observation  is  that  search  efficiency  remains 
essentially  unchanged  even  when  Os  search  through  the  same,  unchanging  display 
hundreds of times (Wolfe, Klempen, & Dahlen, 2000). In our standard repeated search
experiments, Os search through an array of letters. Why doesn’t the search become more
efficient? Certainly Os know where the “K” or the “Q” are after a few dozen trials. They
could do the task with their eyes closed. We didn’t have Os close their eyes. We simply
removed  the  display  and  had  Os  make  a  localizing  mouse  click  on  the  remembered 
location  of  the  target  letter.  When  Os  make  these  localizing  responses  on  visible  displays, 
the slope of the RT x set size function is about 35 msec/item. Os can do the task from
memory, but the slope is about 100 msec/item. So, here is a case where Os do not use
memory  because  it  is  too  slow.  It  is  more  efficient  to  do  the  visual  search  de  novo  than  to 
rely  on  memory. 

Similarly,  within  a  search,  we  suspect  that  memory  processes  are  slow.  They  do  not  play  a 
role  in  a  standard,  laboratory  search  for  a  T  among  Ls.  However,  in  a  real  world  search, 
strategic  planning  can  play  a  role  (e.g.  I  remember  looking  on  the  desk.  The  keys  were  not 
there.  I  will  turn  my  attention  to  the  sofa.) 


AIM  3:  The  relationship  of  different  modes  of  attentional  control.  There  are 
multiple  processes  that  can  control  attention.  Some  of  these  appear  to  be  very  fast. 
Others  are  closely  coupled  with  eye  movements.  The  work  in  Aim  3  is  intended  to 
determine how these share control of visual attentional resources.

In the lab, we tend to examine search (or other processes) in relative isolation. In the
world, several tasks may be running at the same time. The papers described in this section
are united by their concern with the interaction of search with other processes.


Horowitz,.  -T.  pf^s).\;|l*e.'ipe©J.ii|j|

We begin with a set of experiments on what we, somewhat expansively, call “free will”.
The  original  question  grew  out  of  our  work  on  memory  in  search.  Recall  that  our  data 
show  little  or  no  memory  for  the  course  of  a  search.  This  seems  odd.  In  a  random  display 
where  the  target  could  be  anywhere,  one  could  produce  the  same  benefits  offered  by 
perfect  memory  if  one  merely  searched  the  display  in  order.  Any  order  will  do,  but  one  can 
imagine  “reading”  the  array  from  upper  left  to  lower  right.  In  an  earlier  paper  (Wolfe, 
Alvarez, & Horowitz, 2000), we compared search under conditions that allowed covert
attention to be deployed in its usual anarchic manner to other conditions in which
attention  was  moved  by  an  act  of  volition  from  item  to  item.  We  estimated  the  time 
required  for  each  type  of  deployment  and  found  that  anarchic  deployments  were  fast  (35- 
100  msec  per  shift)  while  volitional  deployments  were  slow  (200-300  msec  per  shift). 
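
A back-of-the-envelope calculation shows why a fast, memory-free search can still compete with a slow, orderly one. The sketch below is an illustration under stated assumptions, not the analysis from the paper: it assumes a serial search for one target among N items, takes the expected number of deployments to be N for random sampling with replacement and (N+1)/2 for a systematic scan, and plugs in per-shift times chosen from within the ranges quoted above. The function names and the particular values (50 and 250 msec) are hypothetical.

```python
def expected_time_anarchic(n_items, msec_per_shift):
    """Random sampling with replacement: about n_items deployments on average."""
    return n_items * msec_per_shift

def expected_time_volitional(n_items, msec_per_shift):
    """Systematic scan with no revisits: (n_items + 1) / 2 deployments on average."""
    return (n_items + 1) / 2 * msec_per_shift

if __name__ == "__main__":
    # Assumed values within the quoted ranges (35-100 and 200-300 msec per shift).
    anarchic_msec, volitional_msec = 50, 250
    for n in (5, 10, 20):
        a = expected_time_anarchic(n, anarchic_msec)
        v = expected_time_volitional(n, volitional_msec)
        print(f"set size {n:2d}: anarchic {a:6.0f} msec, volitional {v:6.0f} msec")
```

With these assumed values, the orderly scan needs roughly half as many deployments, but each one costs five times as much, so the anarchic search is expected to finish sooner at every set size.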

The  2000  paper  was  a  very  short  report.  In  the  2009  paper,  we  report  on  a  much  more 
extensive  set  of  experiments  that  provide  converging  evidence  for  this  point.  Volitional 
control  of  the  deployments  of  attention  is  possible  but  it  is  much  slower  than  the  free- 
running  deployments  that  occur  when  our  personal  “search  engine”  is  given  a  task  and  left 
to  solve  it  in  any  manner  it  chooses.  It  seems  more  than  coincidental  that  the  speed  of 
volitional  attentional  deployments  is  very  similar  to  the  rate  of  saccadic  eye  movements. 
We suspect that the underlying mechanisms are related.

Pursuing this connection between movements of the eyes and movements of attention, we
devised  a  task  in  which  attention  either  made  ballistic  (saccadic)  movements  from  point  to 
point  or  tracked  a  moving  target  (pursuit).  We  found  that  pursuit  was  faster  than  saccadic 
jumps  in  this  case.  The  broader  point  is  that  the  control  of  attentional  pursuit  is  different 
than  the  control  of  attentional  saccades. 


Alvarez, G. A., H. C. Arsenio, et al. (2005). Do Multielement Visual Tracking And Visual
Search Draw Continuously On The Same Visual Attention Resources? J Exp Psychol
Hum Percept Perform 31(4): 643-667.


Multiple object tracking (MOT) can be thought of as a form of attentional pursuit. In a display
of  several  identical  items,  a  subset  is  briefly  highlighted.  All  items  then  begin  to  move  and 
the  observer’s  task  is  to  track  the  designated  subset.  This  task  certainly  takes  ‘attention’. 
How is the resource used in MOT related to the attentional resources required in visual
search? To answer this question, we had Os perform both tasks during a single trial. Os
began tracking a set of dots. At some point, a search array appeared and Os were asked
to perform a search task while still being able to report on the tracked set when asked later.
We  used  an  “attentional  operating  characteristic”  (AOC)  method  to  determine  if  the  two 
tasks  interfered  with  each  other  (Sperling  &  Melchner,  1978).  Measured  in  this  manner,  we 
found  that  the  two  tasks  did  not  interfere  any  more  than  other  essentially  independent 
tasks. 

We  did  notice  that  RTs  in  the  search  task  were  much  longer  when  Os  were  also  tracking. 
We  developed  the  hypothesis  that  Os  can  task  switch  between  search  and  MOT.  In  effect, 
it  is  possible  to  put  down  the  tracked  balls,  perform  a  few  hundred  msec  of  search,  and  then 
return to the MOT task. This observation is sufficiently interesting that it has led to a
separate line of investigation. The primary funding for that work comes from another
grant.  However,  it  can  be  seen  as  a  logical  extension  of  the  work  in  Aim  3  on  the 
interaction  of  different  modes  of  attention.  The  primary  publications  in  this  line  are  shown 
below. 


Wolfe, J. M., Place, S. S., & Horowitz, T. S. (2007). Multiple Object Juggling: Changing
What Is


Tracking  And  E 


Psy#iiW#fty^ 


^4|nie^  Role  Of  Location  And 
Of  Moving  Objects.  Percept 


CHAPTERS 

Finally,  we  note  that  AFOSR  funding  was  instrumental  in  the  preparation  of  several  review 
chapters  on  visual  search: 


Wolfe, J. M., & Reynolds, J. H. (2008). Visual Search. In A. I. Basbaum, A. Kaneko, G.

Contributions to Society, October 2008: Worth.


Visual Search: Is It A Matter Of Life And Death? In M. A.